De-identifying Swedish clinical text - refinement of a gold standard and experiments with Conditional random fields
In order to perform research on the information contained in Electronic Patient Records (EPRs), access to the data itself is needed. This is often very difficult due to confidentiality regulations. The data sets need to be fully de-identified before they can be distributed to researchers. De-identification is a difficult task where the definitions of annotation classes are not self-evident. We present work on the creation of two refined variants of a manually annotated gold standard for de-identification: one created automatically, and one created through discussions among the annotators. These are used for the training and evaluation of an automatic system based on the Conditional Random Fields algorithm. Evaluating with four-fold cross-validation on sets of around 4,000-6,000 annotation instances, we obtained very promising results for both gold standards: F-scores around 0.80 for a number of experiments, with higher results for certain annotation classes. Moreover, the system found 49 instances that were initially counted as false positives but turned out to be true positives missed by the annotators. Our intention is to make this gold standard available to other research groups in the future. Despite being slightly more time-consuming to produce, we believe the manual consensus gold standard is the most valuable for further research. We also propose a set of annotation classes to be used for similar de-identification tasks.
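Evaluation of a de-identification system against a gold standard typically comes down to per-class precision, recall, and F-score over annotated tokens. A minimal sketch of that computation, with invented label names (the paper's exact annotation class set is not reproduced here):

```python
# Token-level precision/recall/F1 for one de-identification class,
# comparing system output against gold-standard labels.
def prf1(gold, pred, label):
    """Precision, recall and F1 for a single annotation class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p == label)
    fp = sum(1 for g, p in zip(gold, pred) if p == label and g != label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: the system misses one NAME token and over-predicts one DATE.
gold = ["O", "NAME", "NAME", "O", "DATE", "O"]
pred = ["O", "NAME", "O",    "O", "DATE", "DATE"]
p, r, f = prf1(gold, pred, "NAME")  # precision 1.0, recall 0.5
```

In a cross-validation setup as described above, this score would be computed per fold and averaged.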
Facts and Fabrications about Ebola: A Twitter Based Study
Microblogging websites like Twitter have been shown to be immensely useful for spreading information on a global scale within seconds. The detrimental effect, however, of such platforms is that misinformation and rumors are also as likely to spread on the network as credible, verified information. From a public health standpoint, the spread of misinformation creates unnecessary panic for the public. We recently witnessed several such scenarios during the outbreak of Ebola in 2014 [14, 1]. In order to effectively counter medical misinformation in a timely manner, our goal here is to study the nature of such misinformation and rumors in the United States during fall 2014, when a handful of Ebola cases were confirmed in North America. It is a well-known convention on Twitter to use hashtags to give context to a Twitter message (a tweet). In this study, we collected approximately 47M tweets related to Ebola from the Twitter streaming API. Based on hashtags, we propose a method to classify the tweets into two sets: credible and speculative. We analyze these two sets and study how they differ in terms of a number of features extracted from the Twitter API. In conclusion, we infer several interesting differences between the two sets. We outline further potential directions for using this material to monitor and separate speculative tweets from credible ones, enabling improved public health information.
Comment: Appears in SIGKDD BigCHat Workshop 201
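The hashtag-based partition described above can be sketched as a simple set-membership test. The hashtag lists below are invented placeholders, not the study's curated lists:

```python
# Partition tweets into "credible" vs "speculative" sets by hashtag.
# Hypothetical hashtag lists for illustration only.
CREDIBLE_TAGS = {"#cdc", "#who", "#ebolafacts"}
SPECULATIVE_TAGS = {"#ebolahoax", "#ebolaconspiracy"}

def classify(tweet: str) -> str:
    """Assign a tweet to a set based on the hashtags it contains."""
    tags = {w.lower() for w in tweet.split() if w.startswith("#")}
    if tags & SPECULATIVE_TAGS:
        return "speculative"
    if tags & CREDIBLE_TAGS:
        return "credible"
    return "unlabelled"
```

Tweets matching neither list would simply be excluded from the two analysis sets.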
Something Old, Something New — Applying a Pre-trained Parsing Model to Clinical Swedish
Proceedings of the 18th Nordic Conference of Computational Linguistics, NODALIDA 2011.
Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa.
NEALT Proceedings Series, Vol. 11 (2011), 287-290.
© 2011 The editors and contributors.
Published by the Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt.
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/1695
Uncertainty Detection as Approximate Max-Margin Sequence Labelling
This paper reports experiments for the CoNLL 2010 shared task on learning to detect hedges and their scope in natural language text. We have addressed the experimental tasks as supervised linear maximum margin prediction problems. For sentence level hedge detection in the biological domain we use an L1-regularised binary support vector machine, while for sentence level weasel detection in the Wikipedia domain, we use an L2-regularised approach. We model the in-sentence uncertainty cue and scope detection task as an L2-regularised approximate maximum margin sequence labelling problem, using the BIO-encoding. In addition to surface level features, we use a variety of linguistic features based on a functional dependency analysis. A greedy forward selection strategy is used in exploring the large set of potential features.
Our official results for Task 1 are an F1-score of 85.2 for the biological domain and 55.4 for the Wikipedia set. For Task 2, our official result is 2.1 for the entire task, with a score of 62.5 for cue detection. After resolving errors and final bugs, our final results are: Task 1, biological 86.0 and Wikipedia 58.2; Task 2, scopes 39.6 and cues 78.5.
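The BIO-encoding used for the in-sentence cue and scope labelling can be illustrated on a toy sentence. The token indices and label names below are illustrative, not taken from the shared-task data:

```python
# BIO-encode a span over a tokenised sentence: the first token of the
# span gets B-<label>, subsequent span tokens get I-<label>, all other
# tokens get O. This is the label scheme for the sequence-labelling
# formulation of cue and scope detection.
def bio_encode(tokens, span_start, span_end, label):
    tags = ["O"] * len(tokens)
    if span_start < span_end:
        tags[span_start] = f"B-{label}"
        for i in range(span_start + 1, span_end):
            tags[i] = f"I-{label}"
    return tags

tokens = ["These", "results", "may", "indicate", "a", "role", "for", "p53"]
cue = bio_encode(tokens, 2, 3, "CUE")       # "may" is the hedge cue
scope = bio_encode(tokens, 2, 8, "SCOPE")   # scope runs to the clause end
```

A sequence labeller then predicts one such tag per token, and spans are read back off the B/I runs.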
Is artificial data useful for biomedical Natural Language Processing algorithms?
A major obstacle to the development of Natural Language Processing (NLP) methods in the biomedical domain is data accessibility. This problem can be addressed by generating medical data artificially. Most previous studies have focused on the generation of short clinical text, and evaluation of the data utility has been limited. We propose a generic methodology to guide the generation of clinical text with key phrases. We use the artificial data as additional training data in two key biomedical NLP tasks: text classification and temporal relation extraction. We show that artificially generated training data used in conjunction with real training data can lead to performance boosts for data-greedy neural network algorithms. We also demonstrate the usefulness of the generated data for NLP setups where it fully replaces real training data.
Comment: BioNLP 201
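The augmentation setup described above, where artificial examples are appended to the real training set, can be sketched as follows. The data and the mixing ratio are placeholders, not the paper's actual generator or configuration:

```python
# Mix real and artificially generated training examples at a chosen
# ratio before training a downstream classifier.
import random

def augment(real, artificial, ratio):
    """Return real data plus up to ratio * len(real) artificial examples."""
    k = min(len(artificial), int(ratio * len(real)))
    return real + random.sample(artificial, k)

# Toy (text, label) pairs standing in for real and generated clinical notes.
real = [("patient reports chest pain", 1), ("routine follow-up visit", 0)]
artificial = [("synthetic note mentioning pain", 1),
              ("synthetic discharge summary", 0),
              ("synthetic normal exam", 0)]
train = augment(real, artificial, ratio=1.0)  # doubles the training set
```

Setting the real list to empty would correspond to the full-replacement setup the abstract mentions.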
Mixing and blending syntactic and semantic dependencies
Our system for the CoNLL 2008 shared
task uses a set of individual parsers, a set of
stand-alone semantic role labellers, and a
joint system for parsing and semantic role
labelling, all blended together. The system
achieved a macro averaged labelled F1-
score of 79.79 (WSJ 80.92, Brown 70.49)
for the overall task. The labelled attachment
score for syntactic dependencies was
86.63 (WSJ 87.36, Brown 80.77) and the
labelled F1-score for semantic dependencies
was 72.94 (WSJ 74.47, Brown 60.18)
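The overall CoNLL 2008 score is the macro average of the syntactic labelled attachment score and the semantic labelled F1; averaging the two per-task figures above reproduces the reported overall score (up to rounding of the organisers' unrounded inputs):

```python
# Macro-averaged score for the CoNLL 2008 joint task: the unweighted
# mean of the syntactic LAS and the semantic labelled F1.
def macro_score(syntactic_las, semantic_f1):
    return (syntactic_las + semantic_f1) / 2

overall = macro_score(86.63, 72.94)  # ~79.79, the reported overall score
wsj = macro_score(87.36, 74.47)      # ~80.92 on the WSJ test set
```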
Identifying Mentions of Pain in Mental Health Records Text: A Natural Language Processing Approach
Pain is a common reason for accessing healthcare resources and is a growing area of research, especially in its overlap with mental health. Mental health electronic health records are a good data source to study this overlap. However, much information on pain is held in the free text of these records, where mentions of pain present a unique natural language processing problem due to their ambiguous nature. This project uses data from an anonymised mental health electronic health records database. The data are used to train a machine learning based classification algorithm to classify sentences as discussing patient pain or not. This will facilitate the extraction of relevant pain information from large databases, and the use of such outputs for further studies on pain and mental health. 1,985 documents were manually triple-annotated to create gold standard training data, which was used to train three commonly used classification algorithms. The best performing model achieved an F1-score of 0.98 (95% CI 0.98-0.99).
Comment: 5 pages, 2 tables, submitted to the MEDINFO 2023 conference
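A confidence interval around an F1-score, like the 95% CI reported above, is commonly obtained by bootstrap resampling of the test set. A minimal percentile-bootstrap sketch with toy labels (not the study's data or its exact CI method, which the abstract does not specify):

```python
# Percentile bootstrap CI for a binary classifier's F1-score.
import random

def f1(gold, pred):
    tp = sum(g == p == 1 for g, p in zip(gold, pred))
    fp = sum(p == 1 and g == 0 for g, p in zip(gold, pred))
    fn = sum(g == 1 and p == 0 for g, p in zip(gold, pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def bootstrap_ci(gold, pred, n_boot=1000, alpha=0.05, seed=0):
    """Resample test instances with replacement; take percentile bounds."""
    rng = random.Random(seed)
    n = len(gold)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1([gold[i] for i in idx], [pred[i] for i in idx]))
    scores.sort()
    return scores[int(alpha / 2 * n_boot)], scores[int((1 - alpha / 2) * n_boot) - 1]

# Toy evaluation: 200 instances, a mostly correct classifier.
gold = [1] * 80 + [0] * 120
pred = [1] * 70 + [0] * 10 + [1] * 15 + [0] * 105
lo, hi = bootstrap_ci(gold, pred)
```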
Development of a Knowledge Graph Embeddings Model for Pain
Pain is a complex concept that can interconnect with other concepts such as a disorder that might cause pain, a medication that might relieve pain, and so on. To fully understand the context of pain experienced by either an individual or across a population, we may need to examine all concepts related to pain and the relationships between them. This is especially useful when modeling pain that has been recorded in electronic health records. Knowledge graphs represent concepts and their relations by an interlinked network, enabling semantic and context-based reasoning in a computationally tractable form. These graphs can, however, be too large for efficient computation. Knowledge graph embeddings help to resolve this by representing the graphs in a low-dimensional vector space. These embeddings can then be used in various downstream tasks such as classification and link prediction. The various relations associated with pain which are required to construct such a knowledge graph can be obtained from external medical knowledge bases such as SNOMED CT, a hierarchical systematic nomenclature of medical terms. A knowledge graph built in this way could be further enriched with real-world examples of pain and its relations extracted from electronic health records. This paper describes the construction of such knowledge graph embedding models of pain concepts, extracted from the unstructured text of mental health electronic health records, combined with external knowledge created from relations described in SNOMED CT, and their evaluation on a subject-object link prediction task. The performance of the models was compared with other baseline models.
Comment: Accepted at AMIA 2023, New Orleans
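Link prediction with knowledge graph embeddings reduces to scoring candidate triples in the vector space. A minimal TransE-style sketch, where a triple (head, relation, tail) scores well when head + relation ≈ tail; the embeddings and concept names are toy values, not the paper's trained model:

```python
# TransE-style scoring for subject-object link prediction: negative L1
# distance between (head + relation) and the candidate tail embedding,
# so higher scores mean more plausible triples.
def transe_score(h, r, t):
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

# Toy 2-dimensional embeddings for illustration only.
emb = {
    "back pain":   [0.1, 0.3],
    "ibuprofen":   [0.5, 0.1],
    "relieved_by": [0.4, -0.2],
}
# (back pain, relieved_by, ibuprofen): head + relation lands on the tail,
# so the score is approximately 0, the best possible value.
score = transe_score(emb["back pain"], emb["relieved_by"], emb["ibuprofen"])
```

In evaluation, the model ranks all candidate tails for a (subject, relation) query by this score.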
Investigating bullying as a predictor of suicidality in a clinical sample of adolescents with Autism Spectrum Disorder
For typically developing adolescents, being bullied is associated with increased risk of suicidality. Although adolescents with autism spectrum disorder (ASD) are at increased risk of both bullying and suicidality, there is very little research that examines the extent to which an experience of being bullied may increase suicidality within this specific population. To address this, we conducted a retrospective cohort study to investigate the longitudinal association between experiencing bullying and suicidality in a clinical population of 680 adolescents with ASD. Electronic health records of adolescents (13–17 years) with a diagnosis of ASD, using mental health services in South London, were analyzed. Natural language processing was employed to identify mentions of bullying and suicidality in the free text fields of adolescents' clinical records. Cox regression analysis was employed to investigate the longitudinal relationship between bullying and suicidality outcomes. Reported experience of bullying in the first month of clinical contact was associated with an increased risk of suicidality over the follow‐up period (hazard ratio = 1.82; 95% confidence interval = 1.28–2.59). In addition, female gender, psychosis, affective disorder diagnoses, and higher intellectual ability were all associated with suicidality at follow‐up. This study is the first to demonstrate the strength of longitudinal associations between bullying and suicidality in a clinical population of adolescents with ASD, using automated approaches to detect key life events within clinical records. Our findings provide support for identifying and dealing with bullying in schools, and for incorporating anti-bullying strategies into wider suicide prevention programs for young people with ASD. Autism Res 2020, 13: 988–997. © 2020 The Authors. Autism Research published by International Society for Autism Research and Wiley Periodicals, Inc.
LAY SUMMARY: We investigated the relationship between bullying and suicidality in young people with autism spectrum disorder (ASD). We examined the clinical records of adolescents (aged 13–18 years) with ASD in South London who were receiving treatment from Child and Adolescent Mental Health Services. We found that if they reported being bullied in the first month after they were first seen by mental health services, they were nearly twice as likely to go on to develop suicidal thoughts or behaviors.
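The hazard ratio reported above relates to the underlying Cox regression coefficient through a simple exponential, HR = exp(beta). A minimal illustration of the conversion, with the coefficient back-derived from the paper's HR of 1.82 purely for demonstration:

```python
# Convert between a Cox model coefficient and the hazard ratio it implies.
import math

beta = math.log(1.82)          # coefficient implied by HR = 1.82
hazard_ratio = math.exp(beta)  # back to 1.82: ~1.8x the hazard for bullied
                               # versus non-bullied adolescents
```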
Sample Size in Natural Language Processing within Healthcare Research
Sample size calculation is an essential step in most data-based disciplines. Large enough samples ensure representativeness of the population and determine the precision of estimates. This is true for most quantitative studies, including those that employ machine learning methods, such as natural language processing, where free text is used to generate predictions and classify instances of text. Within the healthcare domain, the lack of sufficient corpora of previously collected data can be a limiting factor when determining sample sizes for new studies. This paper tries to address the issue by making recommendations on sample sizes for text classification tasks in the healthcare domain.
Models trained on the MIMIC-III database of critical care records from Beth Israel Deaconess Medical Center were used to classify documents as having or not having Unspecified Essential Hypertension, the most common diagnosis code in the database. Simulations were performed using various classifiers on different sample sizes and class proportions. This was repeated for a comparatively less common diagnosis code within the database: diabetes mellitus without mention of complication.
Smaller sample sizes resulted in better results when using a K-nearest neighbours classifier, whereas larger sample sizes provided better results with support vector machines and BERT models. Overall, a sample size larger than 1,000 was sufficient to provide decent performance metrics.
The simulations conducted within this study provide guidelines that can be used as recommendations for selecting appropriate sample sizes and class proportions, and for predicting expected performance, when building classifiers for textual healthcare data. The methodology used here can be modified for sample size estimation with other datasets.
Comment: Submitted to Journal of Biomedical Informatics
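The simulation loop described above can be sketched as subsampling a labelled corpus at increasing sizes, fitting a model, and recording a metric at each size. Here a trivial majority-class baseline and synthetic data stand in for the paper's KNN/SVM/BERT classifiers and MIMIC-III documents:

```python
# Skeleton of a sample-size simulation: draw samples of increasing size,
# "train" a model on each, and record a performance metric per size.
import random

def simulate(corpus, sizes, seed=0):
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        sample = rng.sample(corpus, n)
        labels = [y for _, y in sample]
        # Placeholder model: predict the sample's majority class.
        majority = max(set(labels), key=labels.count)
        # Metric: accuracy of that baseline on the sample itself.
        results[n] = sum(y == majority for y in labels) / n
    return results

# Synthetic (document, label) corpus, roughly one third positive.
corpus = [(f"doc {i}", i % 3 == 0) for i in range(2000)]
curve = simulate(corpus, sizes=[100, 500, 1000])
```

In the actual study, the placeholder model and in-sample metric would be replaced by a real classifier evaluated on held-out data, with class proportions varied as well.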